Friday, December 12, 2003

Reducing the Variance


Assaraf and Caffarel have published a couple of papers (JCP 119, 10536 and JCP 113, 4028) about reducing the variance for general estimators in QMC. The VMC energy has the well-known zero-variance property - the variance goes to zero as the trial wavefunction approaches an eigenstate. Through a "renormalization", the authors show how to extend this property to quantities other than the energy.


The basic approach seems to be to add a term (with zero average) to the bare estimator. Then that added term can be optimized to reduce the variance. Their derivation starts with the Hellmann-Feynman theorem, but this seems confusing to me.


Why not start with the new estimator, O-tilde = O + B? ("O tilde" and "O" are notations from the paper; "B" is my own notation for all the added terms.) Then we demand <B> = 0, and use a Lagrange multiplier to enforce it while optimizing the variance (<(O+B)^2> - <O+B>^2)?


Their paper gives a particular form for B that ensures the <B> = 0 constraint is met (or at least that <B> vanishes as the trial wavefunction approaches the ground state).


Can this approach be used to reduce the variance in classical Monte Carlo as well? In principle, B could be optimized in that case as well - the hard part is finding a special form as in Quantum Monte Carlo.
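

For concreteness, here is a minimal sketch of the generic version of this trick - a control variate - in a plain Monte Carlo setting. The integrand, the choice of B, and the coefficient are all toy choices of mine, not the specific form from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 100_000)

O = np.exp(x)        # bare estimator: samples of e^x for the integral on [0, 1]
B = x - 0.5          # any quantity with a known zero average
c = -np.cov(O, B)[0, 1] / np.var(B)   # coefficient that minimizes the variance

O_tilde = O + c * B  # modified estimator: same mean, smaller variance
print("bare     :", O.mean(), "+/-", O.std() / np.sqrt(len(O)))
print("modified :", O_tilde.mean(), "+/-", O_tilde.std() / np.sqrt(len(O)))
```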


Added thoughts (12/14/2003): The primary application of this method is to forces, which are plagued by an infinite variance in QMC. This looks like a good way to improve computations of the force.


The paper looks at how average values and variances scale (at least to leading polynomial order) with delta psi (the difference between the exact ground state and the trial wavefunction). For general (non-energy) estimators, the bias (or error) in the average value is linear in delta psi, and the variance is constant. Using their procedure, the error is quadratic and the variance is linear. It appears the added terms (B) cancel out the leading error terms in the estimator. The question is then, why stop at the leading order? Can a similar process be applied to cancel out the quadratic term in the average?

Thursday, December 04, 2003

Free Energy Chronicles


Free energy calculations are annoyingly difficult. Computing averages with Metropolis is relatively easy, but the free energy is hard to get. The usual method is thermodynamic integration over a path from a state of known free energy to the desired state. (Maybe it's the ensemble averages that are anomalously easy, making other quantities seem difficult by comparison.) In addition, some other aspects of free energy calculations bother me as well.


The typical example for explaining Monte Carlo integration is to compute the area of an irregularly shaped field or pond by throwing rocks at it. But then computing the area (volume, free energy, whatever) is actually quite difficult in practice. Of course in this example, one has to compare the rocks falling in the desired area to those falling in an enclosing area that can be computed analytically, so it's still computing the difference between two states.
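

To make the rock-throwing picture concrete, a minimal sketch of the hit-or-miss estimate, with a made-up pond shape (the region and all the numbers are my invention):

```python
import numpy as np

def in_pond(x, y):
    # an arbitrary irregular region inside the unit square
    return (x - 0.5) ** 2 + 2.0 * (y - 0.4) ** 2 + 0.3 * np.sin(6.0 * x) * y < 0.25

rng = np.random.default_rng(1)
n = 200_000
x, y = rng.uniform(0.0, 1.0, n), rng.uniform(0.0, 1.0, n)
hits = in_pond(x, y)
area = hits.mean()                    # the enclosing square has area 1
err = hits.std() / np.sqrt(n)
print(f"area = {area:.4f} +/- {err:.4f}")
```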


Another issue that bothers my intuition is that if I were to wander around in a field I would have some idea how big the field is based on my wanderings. In other words, why can't we get the area from properties of the random walk? This could simply be pushing the analogy too far. A person wandering is not undergoing a truly random walk. They would likely walk in a single direction until hitting an edge and then change direction. So imagine riding an out-of-control Segway around a field - would you still get a sense of the size of the field?


Additionally, a person is able to see the edges. So imagine being in a corn field (I am in central Illinois, after all - others can imagine a forest, with very soft trees) on an out-of-control Segway. Would it still be possible to get an idea of how big the field is based on the path taken and how often you bump into the edge?


So I'd like to look into some of the properties of random walks - how much information can be extracted. Are there theoretical limits? Can we change the random walk or come up with different estimators that would let us compute the area?


To investigate concrete but simple cases, I'm thinking of a series of systems that make the problem increasingly realistic. The simplest and easiest to think about is the area of an irregular region. More realistically, the area of the region is much less than the area of the enclosing square, and the region is long and snaky. Then go to higher dimensions, and eventually to a system of hard spheres. To be more generally applicable, the next step would be a general potential (think of a hilly field where we want to know the volume of soil in the field).

Monday, November 10, 2003

Interesting notes


While searching for Monte Carlo information on the web, I ran across Jonathon Goodman's Monte-Carlo Course Materials. One interesting part was Alan Sokal's introductory notes near the bottom (see also this CiteSeer link). I liked these notes because they covered a practical method for computing the autocorrelation time.
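

As I understand it, the practical recipe is to sum the normalized autocorrelation function up to a self-consistent window. A rough sketch of that idea (my paraphrase, not Sokal's code; the window constant c is one common choice):

```python
import numpy as np

def integrated_autocorrelation_time(x, c=6.0):
    """tau_int = 1/2 + sum_t rho(t), truncating the sum with a
    self-consistent window M >= c * tau (one common prescription)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    f = np.fft.rfft(x, 2 * n)                    # FFT-based autocorrelation
    acf = np.fft.irfft(f * np.conjugate(f))[:n]
    rho = acf / acf[0]
    tau = 0.5
    for m in range(1, n):
        tau += rho[m]
        if m >= c * tau:
            break
    return tau

# check on an AR(1) series, where tau_int = (1 + a) / (2 * (1 - a)) exactly
rng = np.random.default_rng(2)
a, n = 0.9, 100_000
y = np.empty(n)
y[0] = rng.normal()
for i in range(1, n):
    y[i] = a * y[i - 1] + rng.normal()
print(integrated_autocorrelation_time(y), "vs exact", (1 + a) / (2 * (1 - a)))
```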

Sunday, November 02, 2003

Optimization


Recently there were two papers in the Journal of Chemical Physics related to QMC optimization. The first, Geometry optimization in quantum Monte Carlo with solution mapping: Application to formaldehyde, is concerned with optimization of nuclear positions using only energy information, and not gradient (force) information, since the latter is noisier and harder to obtain. They use statistical experiment design techniques to pick coordinates at which to perform the energy computations, and fit that data to a quadratic form. The geometry is then optimized using that fitted form.


The second paper is concerned with optimizing the one-body part of the wavefunction: Transcorrelated method for electronic systems coupled with variational Monte Carlo calculation. The authors use a transcorrelated Hamiltonian (some of the correlation has been moved into the Hamiltonian), and compute the SCF solution for the one-body orbitals. They use regular VMC to optimize the Jastrow factors. Note that the two calculations are coupled, and to reach a solution the procedure is to alternately optimize the one-body and Jastrow factors.

Wednesday, October 08, 2003

Monte Carlo on the Graphics Card



A recent short paper on The Graphics Card as Streaming Computer (see related presentation slides) got me wondering if Monte Carlo could be programmed onto a graphics card.


The proposed Protein Explorer from Japan, using a GRAPE type architecture, follows a similar programming model.


One limitation is the precision of the calculations - single precision at best, and some cards only have 8-16 bits. The penalty method or related methods may be useful for correcting errors due to finite precision. (See also Ball, Fink, and Bowler)


Another limitation is the size and complexity of the code that can execute on the graphics card. This might be more useful for simple classical Monte Carlo simulations than QMC.


I suspect this falls into the category of "cool ideas that would be fun to play with, but spending the time researching better algorithms is probably more effective".

Thursday, October 02, 2003

Of Wavefunctions and Wikis


and mailing lists (but that didn't alliterate nicely in the title).


There's a new QMC Mailing list in town. Be the first on your block to sign up.


Also there's a QMC Wiki (A wiki is an easily editable set of community web pages) (No link yet - you'll have to sign up for the mailing list and look at the Sept. archives to get the address)


Both of these are brought to you by Alan Aspuru-Guzik.

Thursday, September 11, 2003

Linear charged particle simulations



Today's paper is by Joerg Rottler and A.C. Maggs, entitled "A Continuum, O(N) Monte-Carlo algorithm for charged particles". There are some earlier papers in the references where this method was developed for lattice systems.

As near as I can tell, the algorithm works by writing the total energy in terms of the electric field rather than the electric potential. The electric field is put on a grid. Particle moves then only require local updates to the electric field grid.

The electrostatic interaction is generated as a result of minimizing the energy functional (a functional of the electric field). The algorithm they give is supposed to recover the proper interaction by Monte Carlo means. This is the part I don't quite understand yet.

It seems like a good example of exploiting Monte Carlo to reduce the complexity of a problem by solving it on average.

Wednesday, July 23, 2003

Linear scaling QMC



"Linear scaling of the local energy in quantum Monte Carlo" JCP 119, 1307 by Manten and Luchow.


Cutting off the orbitals (and using sparse matrix computations for the determinant) and cutting off the Jastrow factor address the time-consuming parts.


One interesting point - they use a topological cutoff for the orbitals rather than a strictly spatial cutoff.


This is the scaling of the local energy with the number of electrons - I'm curious how the correlation time varies with system size.

Wednesday, June 18, 2003

50th Anniversary of the Metropolis Algorithm



Last week I attended a conference on the 50th Anniversary of the Metropolis Algorithm at Los Alamos. To give proper credit to all the authors, it should be the Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller (MRRTT) method, but most people just call it the "Metropolis method" (or algorithm).


I enjoyed meeting people at the conference, learning more about various techniques (look for future posts), and visiting New Mexico (my first time there).


One of the sessions was a history of the Metropolis method, and Marshall Rosenbluth gave a talk. It was good to hear from one of the original authors.

Monday, May 12, 2003

Convergence Diagnostics


A while ago I expressed a desire for better automatic control of Monte Carlo simulations. The magic keyword seems to be "convergence diagnostics".


Statistics practitioners discovered that the Metropolis method (and other Markov Chain methods) were useful for integrals in Bayesian analysis. (handy overview)
I suppose, being statistics researchers, they worried about convergence more than physicists do(?) Note there are two convergence worries - has the algorithm reached the long-time limit (the stationary distribution), and has it converged to the desired distribution?


The most recent review of the various convergence diagnostics that I've found is by Brooks and Roberts (1999). Not all the diagnostics are applicable to Metropolis sampling (or Metropolis-Hastings, as generalized Metropolis is called).
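

One of the simpler diagnostics from that literature, and one I can sketch from memory, is the Gelman-Rubin statistic: run several independent chains and compare the spread of the chain means to the spread within each chain. Values near 1 suggest the chains agree; the code below is a rough sketch, not a careful implementation.

```python
import numpy as np

def gelman_rubin(chains):
    """chains: array of shape (m, n) - m independent chains of length n.
    Returns the potential scale reduction factor R-hat (close to 1 is good)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    chain_vars = chains.var(axis=1, ddof=1)
    W = chain_vars.mean()                        # within-chain variance
    B = n * chain_means.var(ddof=1)              # between-chain variance
    var_plus = (n - 1) / n * W + B / n           # pooled variance estimate
    return np.sqrt(var_plus / W)

# toy example: 4 chains sampling the same Gaussian vs. one stuck elsewhere
rng = np.random.default_rng(3)
good = rng.normal(0.0, 1.0, size=(4, 5000))
bad = good.copy()
bad[0] += 3.0                                    # one chain stuck in the wrong place
print("R-hat (mixed):", gelman_rubin(good))
print("R-hat (stuck):", gelman_rubin(bad))
```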


The other interesting item to look at is "convergence acceleration", to see if any of those techniques are useful for physics simulations.

Tuesday, April 22, 2003

Randomness as a professional disease



In the May/June 2003 issue of Computing in Science and Engineering, Dietrich Stauffer has an article on Sociophysics Simulations (p. 71):


"Statistical physicists, of course, assume that these hierarchies arise from randomness (The illusion that everything is random is a professional disease - morbus Boltzmann - among such physicists, just as silicosis, or black lung, affects mine workers)"

Saturday, April 19, 2003

Generating trial moves cheaply



Today's article is "Monte Carlo simulations using sampling from an approximate potential" by Lev Gelb (JCP 118, 7747). He describes using an approximate potential to make a series of (cheap) moves and then accepting or rejecting that series of moves based on the desired (correct, expensive) potential. This is similar to (but not quite the same as) multilevel Monte Carlo in other work. The focus of multilevel sampling is early rejection of ultimately undesirable trial moves. The focus of Gelb's method seems to be creating uncorrelated trial moves to evaluate with the expensive potential.
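

My rough sketch of the scheme on a toy 1-D problem (the potentials, step sizes, and sub-chain length are all placeholders of mine): run a short Metropolis sub-chain with the cheap potential, then accept or reject the end configuration using the difference between the expensive and cheap energy changes.

```python
import numpy as np

rng = np.random.default_rng(4)
beta = 1.0

def v_cheap(x):                        # approximate potential (cheap to evaluate)
    return 0.5 * x ** 2

def v_full(x):                         # "expensive" potential (here only pretend-expensive)
    return 0.5 * x ** 2 + 0.3 * np.sin(5.0 * x)

def cheap_sub_chain(x, n_sub=20, step=0.5):
    """n_sub ordinary Metropolis moves that see only the cheap potential."""
    for _ in range(n_sub):
        xt = x + rng.uniform(-step, step)
        if rng.random() < np.exp(-beta * (v_cheap(xt) - v_cheap(x))):
            x = xt
    return x

x, samples = 0.0, []
for sweep in range(20_000):
    xp = cheap_sub_chain(x)
    # accept the whole sub-chain, correcting for the full/cheap energy difference
    d_full = v_full(xp) - v_full(x)
    d_cheap = v_cheap(xp) - v_cheap(x)
    if rng.random() < np.exp(-beta * (d_full - d_cheap)):
        x = xp
    samples.append(x)
print("<V_full> =", np.mean([v_full(s) for s in samples]))
```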


I'm a little disappointed that the author didn't reference the similar multilevel Monte Carlo used in PIMC (RMP 67, 279) (see section V.F, p. 329) and VMC (JCP 113, 5123).


On page 7749, he mentions that an empirical potential could be used as the approximate potential and an ab initio potential for the correct potential. This has been done, sort of. (Also in Recent Advances in Quantum Monte Carlo Methods II (World Scientific 2002) or here.) I called the technique "pre-rejection". Consider using Gelb's method, but make only a single trial move using the approximate potential. If that trial move is rejected, the configuration is unchanged, the energy change is clearly zero, and we know the move will be accepted without having to evaluate the expensive potential. This shortcut can be extended to more trial moves, but its value decreases dramatically as the probability that all the trial moves (with the approximate potential) will be rejected becomes very small.


Gelb applied his method to a Lennard-Jones fluid where the potential is cut and shifted. The approximate potential is the potential with a short cutoff and the correct potential has a larger cutoff.


Table I shows speedups and energies (and pressures and heat capacities) with error bars. There seems to be a tradeoff - more speedup lowers the acceptance ratio and leads to larger error bars. This can be quantified by computing the efficiency 1/(T * sigma^2) (or Speedup/sigma^2). Computing efficiencies for system 1 gives


System   Efficiency of Energy (×10^4)
1 (ref)     12
1 (a)       14
1 (b)       19
1 (c)       14
1 (d)       25

From the paper, it looks like a,b and c,d have the same relevant parameters (rc and M), and this method definitely does increase the efficiency.

Thursday, April 17, 2003

The Quest for Better Wavefunctions


One of the nice features of QMC is the freedom in the choice of wavefunction. The downside is that such wavefunctions can be time-consuming and unwieldy to optimize in VMC.


In a recent article - Backflow Correlations for the Electron Gas and Metallic Hydrogen - Holzmann, et al. look at improving wavefunctions through adding analytic information.


The section with the Feynman-Kac approach uses a cumulant expansion. In case you (read: me) need a reminder of the cumulant expansion, look at this online statistics text (chapter 8.4).

Wednesday, April 09, 2003

Followup to "Testing Monte Carlo"



I wrote a program that uses Simpson's rule to compute the energy of two particles in a box interacting via a Lennard-Jones potential.


The web page and program are here

It runs in a second or so on my 1 GHz P3. I expect 3 particles should be doable.
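

I don't know exactly what the linked program computes, but here is a minimal sketch in the same spirit: the classical average Lennard-Jones energy of two particles in a cubic box, by Simpson's rule over the 6-D configuration space. The box size, grid, and temperature are placeholders.

```python
import numpy as np

def simpson_weights(x):
    n = len(x)                         # n must be odd
    h = x[1] - x[0]
    w = np.ones(n)
    w[1:-1:2], w[2:-1:2] = 4.0, 2.0
    return w * h / 3.0

def lj(r2, eps=1.0, sig=1.0):
    s6 = (sig * sig / r2) ** 3
    return 4.0 * eps * (s6 * s6 - s6)

n, L, beta = 11, 4.0, 1.0              # grid points per dimension, box side, 1/kT
x = np.linspace(0.0, L, n)
w = simpson_weights(x)

g = np.meshgrid(*([x] * 6), indexing="ij", sparse=True)   # (x1, y1, z1, x2, y2, z2)
W = 1.0
for axis in range(6):
    shape = [1] * 6
    shape[axis] = n
    W = W * w.reshape(shape)           # product of 1-D Simpson weights

r2 = np.maximum(sum((g[i] - g[i + 3]) ** 2 for i in range(3)), 1e-10)
U = lj(r2)
boltz = np.exp(-beta * U)
Z = np.sum(W * boltz)
print("average LJ energy:", np.sum(W * U * boltz) / Z)
```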


Edit 9/4/2004, fixed the link.

Monday, April 07, 2003

Testing Monte Carlo


You've just written a brand new Monte Carlo code. It's producing vast amounts of data on your fast new computer. How do you know whether that random-looking time series will average to a beautiful result and lead to deep insight, or whether it is simply random noise from the physics of some other universe?


There are several techniques we can use to verify that the codes we write are working correctly.


  1. Unit testing
    This is standard software engineering (or *should* be standard). This involves small drivers to test individual classes or routines. It's good for verifying small, easily understood pieces before integrating into a larger system and for regression testing (making sure the latest change didn't break something).
  2. Solve simple systems with analytic or well-known answers.
    The harmonic oscillator is popular. For QMC, H2 is very well known, and very accurate answers are known from other techniques.
  3. Compute internal quantities in multiple ways.
    The best example in QMC is the local energy - compute the derivatives analytically and numerically (a quick sketch follows this list). In PIMC there are multiple energy estimators.
  4. Compute system values with another integration method.
    It seems that computers are fast enough that we should be able to use grid-based methods to compute answers for small numbers (1-3) of particles.
    For QMC, shutting off the electron-electron interaction simplifies the problem, and Mathematica can handle the resulting integrals (numerically, not symbolically)
  5. Comparison with literature
    Compare with experiment and answers obtained by other methods.
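

For item 3, a tiny example of the analytic-versus-numerical check on a toy problem (the 1-D harmonic oscillator with a Gaussian trial function; a real check would use the actual QMC wavefunction):

```python
import numpy as np

# Trial wavefunction for the 1-D harmonic oscillator (a toy stand-in for a
# real QMC wavefunction): psi(x) = exp(-a x^2), H = -1/2 d^2/dx^2 + 1/2 x^2.
a = 0.4

def psi(x):
    return np.exp(-a * x * x)

def local_energy_analytic(x):
    # -1/2 psi''/psi + 1/2 x^2, with psi''/psi = 4 a^2 x^2 - 2 a
    return a - 2.0 * a * a * x * x + 0.5 * x * x

def local_energy_numeric(x, h=1e-4):
    d2 = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
    return -0.5 * d2 / psi(x) + 0.5 * x * x

for x in [0.0, 0.5, 1.3, -2.1]:
    print(x, local_energy_analytic(x), local_energy_numeric(x))
```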

Friday, March 28, 2003

Do you believe your error bars?



Computing uncertainty estimates from Monte Carlo data usually assumes the data is normal (or can be sufficiently reblocked to be normal, according to the central limit theorem). But what if we don't have quite that much data - what effect would the underlying distribution have on the uncertainty estimates (error bars)?
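

For reference, reblocking itself is simple; here is a sketch of the Flyvbjerg-Petersen style blocking analysis I have in mind (the toy data and number of transforms are arbitrary):

```python
import numpy as np

def blocking_error(data, n_transforms=12):
    """Reblocking: repeatedly average pairs of neighbours and watch the
    naive error estimate grow toward a plateau."""
    x = np.asarray(data, dtype=float)
    errors = []
    for level in range(n_transforms):
        n = len(x)
        errors.append(np.sqrt(x.var(ddof=1) / n))
        if n < 4:
            break
        x = 0.5 * (x[: 2 * (n // 2) : 2] + x[1 : 2 * (n // 2) : 2])
    return errors

# correlated toy data: the plateau value, not the first entry, is the error bar
rng = np.random.default_rng(9)
y = np.convolve(rng.normal(size=100_000), np.ones(20) / 20, mode="valid")
for level, err in enumerate(blocking_error(y)):
    print(level, err)
```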


Mervlyn Moodley looks at this question in the paper The Lognormal Distribution and Quantum Monte Carlo Data.


The interesting figures seem to be figures 3 and 9. In Figure 3, we see the confidence intervals change as the data is reblocked (and hence comes closer to normal). Figure 9 is generated from a real simulation, and it seems there is very little difference between the error bars computed assuming normality and those taking the underlying lognormal distribution into account.

Monday, March 17, 2003

Bilinear QMC


I'm reading a recent paper by Arias de Saavedra and Kalos titled "Bilinear diffusion quantum Monte Carlo methods" (PRE 67, 026708). They present an algorithm that uses a pair of walkers (the bilinear part) to sample the square of the ground state wavefunction (rather than the first power of the wavefunction, as in DMC).


They claim it's useful for computing unbiased expectation values (see my post from Monday, Feb. 24 - although I'm not sure it would help with derivative operators) and energy differences.


The part I find most intriguing is the way they use importance sampling - they get reasonable results for hydrogen without it. And when they do use importance sampling, it's only to remove singularities due to cusps. I would guess that any successful scheme (i.e., able to scale reasonably with system size) needs to use as accurate a trial wavefunction as possible to guide the sampling.


The more I study the paper, the less I understand it. I guess the first step is to try to reproduce their results for the 1-D harmonic oscillator.

Tuesday, March 11, 2003

Steal this research idea


Automatic control of Monte Carlo simulations - I want to know how to do it, but I don't necessarily want to figure it out myself. :)

  • Determining equilibrium and phase transitions.
    Typically the particles are started in a lattice or random configuration and allowed to run a while to equilibrate. I want automatic detection of equilibrium, which is especially tricky near phase boundaries. Also, near phase transitions, the system can suddenly change phases.
  • Adjustment of parameters for optimum efficiency
    The key quantity affecting the efficiency is the correlation time, but it's very noisy. Is there some other quantity that we could use that would behave similarly, but with less noise? Since this will only affect the efficiency, approximations are quite okay. Like assuming some analytic form for the autocorrelation function with a small number of parameters and fitting to it?


    Also note that SGA-like algorithms can be used for control as well as optimization. For control, the decay parameter levels off at a constant so the algorithm can follow the system, rather than continuing to decrease as it does in optimization.

  • Better control of DMC population

    Occasionally, I have trouble with the number of walkers increasing rapidly. In my codes, this means it exceeds some maximum value and stops the program. Sometimes the number of walkers jumps to a large but stable value, and the energy becomes far too low and clearly wrong. The hard part is that population control introduces a bias, and can't be too intrusive. (A cartoon of one common feedback scheme follows this list.)


    For these items, are there any concepts from control theory or signal processing that would be useful?
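

For the population-control item, here is a cartoon of the trial-energy feedback scheme that is commonly used (not my production code; the toy problem, time step, and feedback coefficient are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
dt, n_target = 0.01, 500

def V(x):                              # 1-D harmonic oscillator; exact E0 = 0.5
    return 0.5 * x * x

walkers = rng.normal(size=n_target)    # starting population
E_T = np.mean(V(walkers))              # initial trial energy
e_history = []
for step in range(2000):
    walkers = walkers + np.sqrt(dt) * rng.normal(size=len(walkers))  # diffusion
    weights = np.exp(-dt * (V(walkers) - E_T))                       # branching weights
    copies = (weights + rng.random(len(walkers))).astype(int)        # stochastic rounding
    walkers = np.repeat(walkers, copies)
    # population control: nudge E_T to steer the population back toward n_target
    E_est = np.mean(V(walkers))        # crude energy estimate used only for the feedback
    E_T = E_est - 1.0 * np.log(len(walkers) / n_target)              # feedback strength is arbitrary
    e_history.append(E_T)
print("walkers:", len(walkers), " mean E_T:", np.mean(e_history[500:]),
      " (exact ground state energy: 0.5)")
```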

Tuesday, March 04, 2003

Dynamic Monte Carlo


Checking the arxiv.org preprint server, there was an article titled An Introduction To Monte Carlo Simulations Of Surface Reactions by A.P.J. Jansen, which describes Dynamic Monte Carlo methods (Kinetic Monte Carlo is similar). One important input is transition probabilities. On page 21, "Estimates of the error made using DFT for such systems are at least about 10 kJ/mol."


Can QMC do better?

Friday, February 28, 2003

VMC Optimization: SGA vs. fixed sample reweighting


In VMC, we parameterize the wave function and vary the parameters until the minimum energy (or variance) is reached. Fixed sample reweighting generates a number of configurations from the MC process, and then uses a least squares minimization algorithm and a reweighting estimate for the energy to minimize the variance of the energies.


The Stochastic Gradient Approximation (SGA) takes a noisy estimate of the gradient and moves in that direction ("downhill"). The amount it moves is scaled by a variable that decreases each iteration (usually 1/n). It is guaranteed to converge, but it may take a while.
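

In code, the SGA loop is nothing more than the few lines below; the noisy_gradient function, the initial step size, and the 1/n schedule are placeholders (in VMC the gradient estimate would come from a short sampling run at the current parameters).

```python
import numpy as np

def sga_minimize(p0, noisy_gradient, a0=1.0, n_steps=5000):
    """Stochastic Gradient Approximation: step downhill along a noisy
    gradient estimate with a step size a_n = a0 / n that shrinks each iteration."""
    p = np.array(p0, dtype=float)
    for n in range(1, n_steps + 1):
        g = noisy_gradient(p)          # e.g. from a short VMC run at parameters p
        p -= (a0 / n) * g
    return p

# toy example: minimize <(x - p)^2> for x ~ N(2, 1); the minimum is at p = 2
rng = np.random.default_rng(6)
noisy_grad = lambda p: 2.0 * (p - rng.normal(2.0, 1.0, size=100).mean())
print(sga_minimize([0.0], noisy_grad))
```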


I don't like the fixed sample reweighting (FSR) method on aesthetic grounds. SGA feels more in line with the "Monte Carlo spirit". FSR feels like it takes very coarse steps: sample configurations for a while, stop and do a minimization, and repeat these two steps for a few iterations. Each step is large and results in significant improvement. The SGA, on the other hand, takes a large number of small steps. Each individual step doesn't give much improvement (or any improvement at all), but the steps accumulate to give a good answer.


Aesthetic considerations aside, FSR requires no analytic gradients for efficient implementation. SGA needs analytic gradients for most of the parameters (you could do numerical gradients, but the time gets much worse).


FSR does variance minimization, and SGA does energy minimization - there may be some differences in the resulting quality of the wave functions (JCP 112, 4935).

Monday, February 24, 2003

Derivative operators and forward walking in DMC


Using Diffusion Monte Carlo (DMC) to compute estimators other than the energy leads to a "mixed estimator" - an average that combines the trial wavefunction and the exact ground state (actually the fixed node ground state) - not what we really want. However, the number of children of each walker after a long time is proportional to the ratio of the fixed node ground state to the trial wavefunction. This leads to the forward walking procedure for obtaining correct averages over the fixed node ground state. See Casulleras and Boronat (PRB 52, 3654) for examples, references, and implementation details.


Numerical values for derivative operators, such as the kinetic energy, are computed by applying the operator to the trial wavefunction. The forward walking procedure only corrects the distribution in the average, not the values of the estimator (we really want the derivative applied to the exact wavefunction, not the trial wavefunction).


Computing the kinetic energy (and the pressure) by naively doing forward walking on the kinetic energy is incorrect. Instead, we can obtain the kinetic energy from the total energy and the forward-walked potential energy.

Thursday, February 20, 2003

Use Bennett's method for free energy calculations


That's the conclusion that Nandou Lu, Jayant Singh, and David Kofke reach in a recent paper (JCP 118, 2977).


Bennett's method requires an estimate of the free energy, but that can be obtained via iteration. The authors argue that even without fully optimizing the solution, the free energy produced is still better than other methods based on overlap sampling.
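

A rough sketch of the Bennett acceptance-ratio equation for the equal-sample-size case, as I remember it (the two harmonic "states" are a toy check of mine, and this should be verified against the paper before being trusted):

```python
import numpy as np
from scipy.optimize import brentq

def bar_free_energy(dU_0, dU_1, beta=1.0):
    """Bennett acceptance ratio for equal sample sizes.
    dU_0: (U1 - U0) evaluated on configurations sampled from state 0.
    dU_1: (U1 - U0) evaluated on configurations sampled from state 1.
    Returns an estimate of the free energy difference F1 - F0."""
    def imbalance(dF):
        lhs = np.sum(1.0 / (1.0 + np.exp(beta * (dU_0 - dF))))
        rhs = np.sum(1.0 / (1.0 + np.exp(-beta * (dU_1 - dF))))
        return lhs - rhs
    lo = min(dU_0.min(), dU_1.min()) - 10.0
    hi = max(dU_0.max(), dU_1.max()) + 10.0
    return brentq(imbalance, lo, hi)

# toy check: U0 = x^2/2 and U1 = 2 x^2 at beta = 1, where dF = ln 2 exactly
rng = np.random.default_rng(7)
x0 = rng.normal(0.0, 1.0, 50_000)      # samples from state 0
x1 = rng.normal(0.0, 0.5, 50_000)      # samples from state 1
dU_0 = 1.5 * x0 ** 2                   # (U1 - U0) on state-0 samples
dU_1 = 1.5 * x1 ** 2                   # (U1 - U0) on state-1 samples
print(bar_free_energy(dU_0, dU_1), "vs exact", np.log(2.0))
```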

Tuesday, February 18, 2003

Coupled Electronic-Ionic Monte Carlo (CEIMC)


Time for a little shameless self-promotion. My thesis work combined a classical MC calculation of H2 molecules with a QMC calculation of the electronic (molecular interaction) energy (much as the Car-Parrinello method combines classical molecular dynamics of the nuclei with a Density Functional Theory (DFT) calculation of the electronic energy). The F77 code to do molecular H2 calculations is here.


There is a problem in the chapter on VMC optimization - I did the fixed sample reweighting with a general optimizer rather than one specifically for least squares problems. This makes a difference in the run time. A corrected version of the data using a least squares optimizer can be found in this article.


Edit, 2/10/2004 - corrected link to thesis

Monday, February 17, 2003

Introduction to QMC


Quantum Monte Carlo is a set of techniques used in chemistry and physics to solve the Schrodinger equation (which describes how electrons and nuclei interact to form atoms and molecules).
The "Monte Carlo" part refers to the use of stochastic (random number) solution methods.


For more information, see

This is an experiment in research/publishing/making useful information available on the web.